This paper reports on a preliminary study of applying
single-channel (scalar) and multichannel (vector) 2-D linear
prediction to color image modeling and coding. Also, the novel
idea of a multi-input single-output 2-D ADPCM coder is
introduced. The results of this study indicate that texture
information in multispectral images can be represented by
linear prediction coefficients or matrices, whereas the
prediction error conveys edge-information. Moreover, by using
a single-channel edge-information we obtained, from original
color images of 24 bits/pixel, reconstructed images of good
quality at information rates of 1 bit/pixel or less.


Two-dimensional linear prediction was successfully applied to
coding monochrome images at rates below 1 bit/pixel [1,2] and
to clustering homogeneous image textures by using 2-D LP
distances [3]. Motivated by the above success of 2-D linear
prediction, we tried to extend its use to multispectral images
either by autoregressively modeling each channel separately or
by using a vector 2-D linear predictor which exploits
cross-correlation between channels. These two approaches
resemble the notions of component and composite encoding
methods for color video signals [4]. A major contribution of
this paper is the introduction of a multi-input single-output
ADPCM coder whose output will be a single-channel
edge-information signal; this reflects the idea that for most
natural color images the edges occur at approximately the same
location in every channel. Although our results refer only to
3-channel color images (red, green, blue), our theoretical
formulation addresses the general case of an M-channel
multispectral image.

MULTICHANNEL 2-D LINEAR PREDICTION

Let $x(m,n) = [x_1(m,n), \ldots, x_M(m,n)]^T$ represent an
M-channel 2-D image vector signal, where $(\cdot)^T$ denotes
the transpose of a vector and $x_i(m,n)$ represents a
single-channel scalar 2-D sequence of image intensity in a
certain spectral band. By exploiting the autocorrelation of
every channel and the cross-correlation between channels, we
formulate the following 2-D vector autoregressive model for
$x(m,n)$:


$$x(m,n) = \sum_{k}\sum_{\ell} A(k,\ell)\, x(m-k,n-\ell) + b + e(m,n) \qquad (1)$$


where we predict the vector $x(m,n)$ from its neighbor vector
values weighted by "prediction matrices" $A(k,\ell)$ of order
$M \times M$. In (1), $(k,\ell)$ range over all integer pairs
in a set $S$, called the region of support of the prediction
mask, and this set determines whether the mask is causal,
quarter-plane, etc. The causality of the prediction mask is
necessary for the recursive computability of (1). The bias
vector $b = [b_1, \ldots, b_M]^T$ accounts for the fact that
the intensity image samples are explicitly biased by a
dc-level vector $d = [d_1, \ldots, d_M]^T$ since they are
always nonnegative. The 2-D vector prediction error signal
$e(m,n)$ is the output of an $M \times M$ matrix prediction
error filter


$$P(z_1,z_2) = I - \sum_{k}\sum_{\ell} A(k,\ell)\, z_1^{-k} z_2^{-\ell} \qquad (2)$$

when the input is $x(m,n)$ and where $I$ denotes the
$M \times M$ identity matrix. The relation between $b$ and
$d$ is


$$b = \Big[\, I - \sum_{k}\sum_{\ell} A(k,\ell) \Big]\, d \qquad (3)$$

Consider the $M \times M$ average prediction error matrix

$$B = \sum_{m}\sum_{n} e(m,n)\, e^T(m,n) \qquad (4)$$

In (4), $(m,n)$ range over all integer pairs corresponding to
pixel locations inside some region of support of $x(m,n)$
which we call the analysis frame. The i-th diagonal entry of
the matrix $B$ represents the mean-squared prediction error in
the i-th channel. The criterion to find the optimal parameters
$\{A(k,\ell), b\}$ of the model is to minimize the trace of
$B$. The inclusion of $b$ in the unknown parameters guarantees
that the prediction error $e(m,n)$ will be a 2-D zero-mean
vector sequence. The normal equations are:

$$\sum_{k}\sum_{\ell} \Phi(i,j;k,\ell)\, A^T(k,\ell) + s(i,j)\, b^T = \Phi(i,j;0,0) \qquad (5a)$$

$$\sum_{k}\sum_{\ell} s^T(k,\ell)\, A^T(k,\ell) + N\, b^T = s^T(0,0) \qquad (5b)$$

where we observe the matrix correlation and vector shift lags
respectively:

$$\Phi(k,\ell;i,j) = \sum_{m}\sum_{n} x(m-k,n-\ell)\, x^T(m-i,n-j) \qquad (6a)$$

$$s(k,\ell) = \sum_{m}\sum_{n} x(m-k,n-\ell) \qquad (6b)$$


In (5), $(k,\ell)$ and $(i,j)$ range over the set $S$. In (6),
$(m,n)$ range over the analysis frame, and $N$ in (5b) denotes
the number of samples inside the analysis frame.
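
The estimation step above can be sketched numerically. The
following is a minimal illustration, not the authors'
implementation: it assumes a quarter-plane mask with
nonnegative offsets, reads samples outside the frame from the
image as needed, and lets NumPy's least-squares solver stand
in for solving the normal equations explicitly. The function
name `fit_vector_ar` and its arguments are hypothetical.

```python
import numpy as np

def fit_vector_ar(x, support, frame):
    """Least-squares fit of the vector AR model with a bias term.

    x       : float array (rows, cols, M), the M-channel image
    support : list of (k, l) offsets with k, l >= 0 (assumed mask)
    frame   : (m0, m1, n0, n1); rows m0..m1-1 and cols n0..n1-1
    Returns the prediction matrices, the bias and the residual.
    """
    m0, m1, n0, n1 = frame
    M = x.shape[2]
    max_k = max(k for k, _ in support)
    max_l = max(l for _, l in support)
    if m0 < max_k or n0 < max_l:
        raise ValueError("frame needs a margin for the prediction mask")
    rows, targets = [], []
    for m in range(m0, m1):
        for n in range(n0, n1):
            # Stack the neighbor vectors, then a constant 1 for the bias.
            past = [x[m - k, n - l] for (k, l) in support]
            rows.append(np.concatenate(past + [np.ones(1)]))
            targets.append(x[m, n])
    R = np.asarray(rows)       # (N, M*len(support) + 1)
    T = np.asarray(targets)    # (N, M)
    # Minimizing the summed squared error of every channel is the
    # same as minimizing the trace of the error matrix.
    W, *_ = np.linalg.lstsq(R, T, rcond=None)
    # Each M-row block of W holds the transpose of one prediction matrix.
    A = {s: W[i * M:(i + 1) * M].T for i, s in enumerate(support)}
    b = W[-1]
    resid = T - R @ W
    return A, b, resid
```

Because the bias is one of the unknowns, the residual of each
channel averages to zero over the frame, which matches the
zero-mean property stated above.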

An alternative way of modeling $x(m,n)$ would be to
autoregressively model each channel separately:

$$x_i(m,n) = \sum_{k}\sum_{\ell} a_i(k,\ell)\, x_i(m-k,n-\ell) + b_i + e_i(m,n) \qquad (7)$$

for $i = 1, 2, \ldots, M$, where the optimal scalar linear
prediction coefficients $a_i(k,\ell)$ and bias coefficient
$b_i$ are obtained by minimizing the mean-squared value of the
scalar prediction error signal $e_i(m,n)$ over the analysis
frame, as explained in [1,2]. Obviously the scalar models in
(7) are a subcase of the vector model in (1) with the
prediction matrices $A(k,\ell)$ being diagonal.

One approach to compute the correlation and shift lags in (6)
is to assume the vector image signal to be zero outside the
analysis frame, which is similar to the autocorrelation method
of 1-D linear prediction. Alternatively, samples on the
borders of the frame could be supplied as needed in the
computation of (6); this latter approach is called the
covariance method. The covariance method gives better
estimates of the predictor parameters and of the bias, and a
smaller mean-squared prediction error than the autocorrelation
method. However, neither method can guarantee stability of
either the resulting scalar or matrix autoregressive models.

The stability of the matrix filter $1/P(z_1,z_2)$ is necessary
for the stable reconstruction of $x(m,n)$ from the prediction
matrices, the bias, and the prediction error signal $e(m,n)$.
This stability is equivalent to the scalar 2-D polynomial
$\det[P(z_1,z_2)]$ being minimum phase, where "$\det[\cdot]$"
denotes the determinant of a matrix. With the covariance
method, the estimation of the bias interacts with the
stability in the following way: From (2) and (3) we infer that
$b = P(1,1)\,d$. Therefore, if the image signal has a nonzero
dc-level ($d \neq 0$) and we arbitrarily require $b = 0$ in
(5), then we force the determinant of $P(1,1)$ to become zero,
which forces the model to be marginally unstable since
$\det[P(1,1)] = 0$ corresponds to a pole on the unit-surface.
Moreover, as we proved in [2], if the prediction mask has a
quarter-plane region of support, then a necessary condition
for stability is

$$\det[P(1,1)] > 0 \qquad (8)$$
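
The link between the bias and $P(1,1)$ used above can be
written out in two steps. This is a sketch of the reasoning
under the assumption that the dc-level $d$ is the mean of
$x(m,n)$ over the frame and that $e(m,n)$ is zero-mean; it is
not a derivation quoted from the references.

```latex
% Set z_1 = z_2 = 1 in the prediction error filter (2):
P(1,1) = I - \sum_{k}\sum_{\ell} A(k,\ell)

% Take the mean of the model (1), with mean of x equal to d
% and mean of e equal to zero:
d = \sum_{k}\sum_{\ell} A(k,\ell)\, d + b
\;\;\Longrightarrow\;\;
b = \Big[ I - \sum_{k}\sum_{\ell} A(k,\ell) \Big] d = P(1,1)\, d

% Forcing b = 0 with d \neq 0 gives P(1,1)\, d = 0 for a nonzero d,
% so P(1,1) is singular:
\det[P(1,1)] = 0
```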

Finally, if we use the autocorrelation method with a 2-D
separable prediction mask, then the stability of the inverse
prediction error filter is guaranteed in both the scalar and
the vector cases.

MULTICHANNEL ADPCM CODING

We used the above theoretical formulation of 2-D linear
prediction for the design of the predictors in the feedback
loops of an ADPCM image coding scheme of the feed-forward
type. Initially, each channel of the multichannel image was
coded separately using a single-input single-output ADPCM, as
described in [1,2], at an average information rate of 1
bit/pixel or less. This resulted in a bit rate of about M
bits/pixel for an M-channel color image. However, since our
interest was in much lower bit rates and because we wanted to
exploit correlation between channels, we used the multi-input
single-output ADPCM scheme shown in Fig. 1.

The philosophy of each feedback loop in Fig. 1 is that for the
i-th channel the $P_i$ predictor forms an estimate from past
samples of the reconstructed image signal $\hat{x}_i(m,n)$.
This estimate is subtracted from the incoming image signal
$x_i(m,n)$ to form the difference signal $d_i(m,n)$ which is
quantized and encoded into the 2-D signal $c(m,n)$ for
transmission. At the receiver, the quantized difference signal
$\hat{d}_i(m,n)$ excites the i-th inverse prediction error
filter to produce the reconstructed image signal
$\hat{x}_i(m,n)$ for the i-th channel.

The design of the multi-input single-output quantizer Q in
Fig. 1 is governed by the intuition that for most natural
color images the edges occur at approximately the same
location in every channel. The edge-information in the i-th
channel is conveyed mainly by the prediction error signal
$e_i(m,n)$. However, assuming small quantization errors, the
difference signal $d_i(m,n)$ approximates $e_i(m,n)$.
Therefore an encoded quantized difference signal would contain
mainly information about the edge-location. This is depicted
in Fig. 2 where the binary images (a), (b), (c) show the
encoded quantized (3-levels/pixel) difference signals of the
red, green, and blue channel separately for a head and
shoulders image with well defined edges. The binary image of
Fig. 2(d), however, shows the 2-levels/pixel common encoded
quantized difference signal which is the output of the
multi-input single-output quantizer of Fig. 1. By comparing
the images of Fig. 2, we realize that by using a single
channel for information about edge-location we do not lose
many edges. The encoded signal $c(m,n)$ was formed by first
finding a single-channel difference signal:

$$d(m,n) = \sum_{i=1}^{M} w_i\, d_i(m,n) \qquad (9)$$

where the $w_i$'s are weighting coefficients, and then
quantizing and encoding $d(m,n)$ as follows:

$$c(m,n) = \begin{cases} 1, & d(m,n) > \theta \\ 0, & -\theta \le d(m,n) \le \theta \\ -1, & d(m,n) < -\theta \end{cases} \qquad (10)$$


The encoded signal $c(m,n)$ represents the sequence of
codewords. The quantized difference signals are determined as
follows:

$$\hat{d}_i(m,n) = c(m,n)\, \Delta_i, \qquad i = 1, 2, \ldots, M \qquad (11)$$

The threshold $\theta$ in (10) and the step sizes $\Delta_i$
in (11) are adapted over each analysis frame of the image
according to the rule:

$$\theta = \xi\, \sigma_e, \qquad \Delta_i = D\, \sigma_{e_i} \qquad (12)$$

where $\sigma_{e_i}$ is the rms value of the i-th prediction
error signal $e_i(m,n)$ in the analysis frame, and $\sigma_e$
is the rms value of a single-channel prediction error signal
formed by a linear combination of all the $e_i(m,n)$ using the
same weighting coefficients as in (9). The constants $\xi$ and
$D$ are determined empirically [1,2]. The 3-level quantization
logic of (10) allows us to set $\theta = 0$ and thus quantize
the difference signal with 1-bit fixed length codewords.
Alternatively, if $\theta \neq 0$, by adjusting $\xi$ we can
produce at the output of the quantizer a large percentage of
zero levels which will reduce significantly the entropy of the
quantized difference signal and enable us to use Huffman
codewords of variable length in order to achieve an average
bit rate of much less than 1 bit/pixel.
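
The quantizer logic of (9) to (12) can be sketched in a few
lines. This is an illustration under stated assumptions rather
than the coder used for the reported results: the function
names and the parameters `thr_scale` and `step_scale` (standing
for the two empirical constants of (12)) are hypothetical, and
the entropy function only estimates the rate a variable-length
code could approach.

```python
import math

def rms(values):
    """Root-mean-square value of a list of numbers."""
    return math.sqrt(sum(v * v for v in values) / len(values))

def frame_parameters(e_channels, weights, thr_scale, step_scale):
    """Rule (12): threshold and step sizes for one analysis frame.

    e_channels : one list of prediction error samples per channel
    weights    : the same weighting coefficients as in (9)
    """
    mixed = [sum(w * e[t] for w, e in zip(weights, e_channels))
             for t in range(len(e_channels[0]))]
    threshold = thr_scale * rms(mixed)
    steps = [step_scale * rms(e) for e in e_channels]
    return threshold, steps

def encode_sample(d_channels, weights, threshold):
    """Eqs. (9)-(10): mix channel differences, quantize to {-1, 0, 1}."""
    d = sum(w * di for w, di in zip(weights, d_channels))
    if d > threshold:
        return 1
    if d < -threshold:
        return -1
    return 0  # with threshold 0 this only happens when d is exactly 0

def decode_sample(c, steps):
    """Eq. (11): every channel receives the codeword times its step."""
    return [c * delta for delta in steps]

def entropy_bits(codewords):
    """First-order entropy of the codeword sequence, in bits/symbol."""
    counts = {}
    for c in codewords:
        counts[c] = counts.get(c, 0) + 1
    n = len(codewords)
    return -sum((k / n) * math.log2(k / n) for k in counts.values())
```

A larger threshold scale produces more zero codewords, and the
entropy estimate then drops below 1 bit/symbol, which is the
effect the variable-length coding exploits.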


In addition to the encoded quantized difference signal, we
must transmit to the receiver "side-information" about the
predictor parameters, the bias and the step size. The
predictors $P_i$ in Fig. 1 are designed either as scalar
predictors (with prediction coefficients operating on the i-th
channel) or as vector predictors (with predictor matrices
operating on all the channels simultaneously). Unfortunately,
the issue of stability and the limited available mathematical
tools for 2-D polynomials limit our choices among various
approaches. For scalar predictors the autocorrelation method
with a 2-D separable prediction mask guarantees stability and
it allows us to quantize the prediction coefficients in the
domain of the log-area-ratios, exactly as done with LPC coding
of speech. Alternatively, we can use the "stabilized"
covariance method with a non-separable 2-D mask, as explained
in [1,2], and use a logarithmic quantizer to quantize the
coefficients inside a fixed range. For vector predictors, we
can use the autocorrelation method with a 2-D separable mask
for guaranteed stability. The quantization of the entries of
the resulting prediction matrices is still under
investigation. The components of the bias vector $b$ and the
step sizes $\Delta_i$ are quantized by using log-quantizers.


We successfully applied the multichannel adaptive prediction
ADPCM coding to color aerial photographs and head and
shoulders images. These color images had only 3 channels (red,
green and blue) with a total resolution of 24 bits/pixel. The
analysis frames consisted of 16×16 or 32×32 pixels. The
prediction masks had a quarter-plane
region of support with 2×2 or 3×3 samples in
extent. By coding each channel separately at 1 bit/pixel or
less, color reconstructed images of high quality resulted at a
rate of ≈3 bits/pixel or less. By using a multi-input
single-output ADPCM with adaptive scalar prediction and
3-level quantization, color reconstructed images of good
quality resulted at a total rate of ≈1 bit/pixel or less (down
to ≈0.d bit/pixel). These rates correspond to compression
factors of about 24:1 or more. The mixing of the different
channels in Eq. (9) was done by using as weighting
coefficients 0.3, 0.6 and 0.1 for the red, green and blue
channel respectively, since the green color is the most
important and the blue is the least important for edge-content
[4].
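
As a quick check of the quoted figure, the compression factor
is the ratio of the original 24 bits/pixel to the coded rate.
The 8:1 value for separate channel coding follows from the
stated rates and is not itself quoted in the text.

```latex
\frac{24\ \text{bits/pixel}}{3\ \text{bits/pixel}} = 8,
\qquad
\frac{24\ \text{bits/pixel}}{1\ \text{bit/pixel}} = 24
```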

By using multichannel ADPCM with adaptive matrix (instead of
scalar) predictors we obtained coded images whose quality was
similar to the quality of the images coded by using scalar
predictors. Since matrix linear prediction gives a smaller
prediction error residual than scalar linear prediction, we
are continuing to investigate ways of achieving higher image
quality using matrix predictors.